Practical Relevance Ranking for 10 Million Books
Author
Abstract
In this paper we briefly describe our production environment and some of the open questions about relevance ranking for 10 million books. We then describe our participation in the Prove It task of the INEX Social Book Search Track. We found that the queries supplied with the Prove It topics were not specific enough to provide good retrieval results. In contrast, the fact fields of the topics, when used as queries, provided good retrieval results. However, our query logs show that users are unlikely to enter queries as long as the fact fields. We therefore tried to create queries that provided good retrieval results but better represented the queries in our logs. We also experimented with simulating the two-stage search process used in our production system: first searching the entire corpus of 10 million books to find relevant books, and then searching within a book to find relevant pages. While we succeeded in creating queries that were more specific than those supplied in the Prove It topics, and those queries produced better results, questions remain about how representative these created queries are of real user queries.
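The two-stage search process mentioned in the abstract can be illustrated with a minimal sketch. It is not the production system's ranking: the scoring function (a raw count of query-term occurrences), the function names `tokenize`, `score`, and `two_stage_search`, and the cutoffs `top_books` and `top_pages` are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def score(query_terms, text):
    """Toy relevance score: total occurrences of the query terms in text.

    This stands in for whatever ranking function a real system uses.
    """
    counts = Counter(tokenize(text))
    return sum(counts[term] for term in query_terms)


def two_stage_search(query, books, top_books=2, top_pages=3):
    """Rank books first, then rank pages inside the selected books.

    books: dict mapping a book id to a list of page texts.
    Returns a list of (book_id, page_number, page_score) tuples,
    with page numbers starting at 1.
    """
    query_terms = tokenize(query)

    # Stage 1: score each book as a whole and keep the best matches.
    book_scores = {
        book_id: score(query_terms, " ".join(pages))
        for book_id, pages in books.items()
    }
    selected = sorted(
        (b for b in book_scores if book_scores[b] > 0),
        key=lambda b: (-book_scores[b], b),
    )[:top_books]

    # Stage 2: within each selected book, rank the individual pages.
    results = []
    for book_id in selected:
        page_hits = [
            (score(query_terms, page), number)
            for number, page in enumerate(books[book_id], start=1)
        ]
        page_hits = sorted(
            (hit for hit in page_hits if hit[0] > 0),
            key=lambda hit: (-hit[0], hit[1]),
        )
        results.extend(
            (book_id, number, page_score)
            for page_score, number in page_hits[:top_pages]
        )
    return results
```

Calling `two_stage_search("mammals", books)` on a small dictionary of books returns page-level hits only from the books that survived the first stage, which is the behaviour the simulation in the paper is meant to approximate.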
Similar resources
Bibliometrics in Online Book Discussions: Lessons for Complex Search Tasks
Online book discussion forums provide rich information on how readers think about and describe books, how books are related to other books and how people search for and recommend books. Within the Social Book Search (SBS) Lab at CLEF we analyse book search requests on the LibraryThing forums and find several types of complex search tasks where bibliometrics naturally combines with information r...
Focused Search in Books and Wikipedia: Categories, Links and Relevance Feedback
In this paper we describe our participation in INEX 2009 in the Ad Hoc Track, the Book Track, and the Entity Ranking Track. In the Ad Hoc track we investigate focused link evidence, using only links from retrieved sections. The new collection is not only annotated with Wikipedia categories, but also with YAGO/WordNet categories. We explore how we can use both types of category information, in t...
Spatially-Aware Information Retrieval on the Internet
In this report, we describe a practical relevance ranking procedure, as it is implemented and integrated in the interim prototype of the SPIRIT search engine. We review the theoretical models and ideas presented in the previous three deliverables of WP5, and state the practical decisions and refinements made during implementation. Possible improvements are identified which will lead to an advan...
Répondre à des requêtes cliniques PICO
In this paper, we address the issue of answering PICO (Patient/Problem, Intervention, Comparison, Outcome) clinical queries. The contributions of this work include (1) a new document ranking model based on a prioritized aggregation operator that computes the global relevance score based on the relevance estimation of the semantic facet sub-queries and (2) leverages the importance of the facets ...
A Comparison of the Position of Laboratory Activities in Biology Textbooks in Iran and the United Kingdom
The aim of this study was to compare the position of laboratory work in biology textbooks in Iran and the United Kingdom. The tenth-grade biology textbooks of both countries were selected as the sample and analyzed against assessment criteria for laboratory work; the absolute frequency, relative frequency, and percentage relative frequency of each...